Abstract

We present a simple algorithm which maintains the topological
order of a directed acyclic graph with n nodes under an online
edge insertion sequence in $O(n^{2.75})$ time, independent of the
number of edges m inserted. For dense DAGs, this is an improvement
over the previous best result of $O(\min\{m^{3/2} \log n,\ m^{3/2} + n^2 \log n\})$
by Katriel and Bodlaender. We also provide an empirical comparison
of our algorithm with other algorithms for online topological
sorting. Our implementation outperforms them on certain hard
instances while it is still competitive on random edge insertion
sequences leading to complete DAGs.



1 Introduction 



A topological order T of a given directed acyclic graph (DAG) $G = (V, E)$
(with $n := |V|$ and $m := |E|$) is a linear ordering of its nodes such that for
all directed paths from $x \in V$ to $y \in V$ ($x \neq y$), it holds that $T(x) < T(y)$.
There exist well-known algorithms for computing the topological ordering of
a DAG in $O(m + n)$ time in an offline setting (see e.g. [2]).
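As a point of reference for the offline setting, the following is a minimal sketch of one standard $O(m + n)$ method (Kahn's algorithm). It is not taken from the cited reference; the function name `topological_order` and the 0-based node labels are assumptions made for illustration.

```python
from collections import deque

def topological_order(n, edges):
    """Return a list T with T[x] < T[y] for every edge (x, y), or raise on a cycle.

    Nodes are assumed to be labelled 0..n-1 (an illustrative convention).
    """
    out = [[] for _ in range(n)]
    indegree = [0] * n
    for u, v in edges:
        out[u].append(v)
        indegree[v] += 1

    # Nodes without unprocessed predecessors can be placed next.
    ready = deque(v for v in range(n) if indegree[v] == 0)
    T = [None] * n
    next_position = 0
    while ready:
        u = ready.popleft()
        T[u] = next_position
        next_position += 1
        for v in out[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)

    if next_position < n:
        raise ValueError("graph contains a cycle")
    return T
```

Every node and every edge is handled a constant number of times, which gives the $O(m + n)$ bound.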
In the online variant of this problem, the edges of the DAG are not known
in advance but are given one at a time. Each time an edge is added to the 
DAG, we are required to update the bijective mapping T. 

The online topological ordering has been studied in the following contexts 

• As an online cycle detection routine in pointer analysis [13].

• Incremental evaluation of computational circuits 

• Compilation jHl 02] where dependencies between modules are main- 
tained to reduce the amount of recompilation performed when an up- 
date occurs. 



The naive way of maintaining an online topological order, i.e., to compute it
each time from scratch with the offline algorithm, takes $O(m^2 + mn)$ time.
Marchetti-Spaccamela et al. (MNR) gave an algorithm that can insert
m edges in $O(mn)$ time. Alpern et al. proposed a different algorithm
(AHRSZ) which runs in $O(\|\delta\| \log \|\delta\|)$ time per edge insertion, with
$\|\delta\|$ measuring the number of edges of the minimal node subgraph that needs
to be updated. Note that not all edges of this subgraph need to be visited and
hence even $O(\|\delta\|)$ time per insertion is not optimal. Katriel and Bodlaender
(KB) [3] analyzed a variant of the AHRSZ algorithm and obtained an upper
bound of $O(\min\{m^{3/2} \log n,\ m^{3/2} + n^2 \log n\})$ for a general DAG. In addition,
they show that their algorithm runs in time $O(m \cdot k \cdot \log^2 n)$ for a DAG for
which the underlying undirected graph has a treewidth k. Also, they give an
$O(n \log n)$ algorithm for DAGs whose underlying undirected graph is a tree.
The algorithm by Pearce and Kelly (PK) empirically outperforms the
other algorithms for sparse random DAGs, although its worst-case runtime
is inferior to KB.

We propose a simple algorithm that works in $O(n^{2.75} \sqrt{\log n})$ time and $O(n^2)$
space, thereby improving upon the results of Katriel and Bodlaender for
dense DAGs. With some simple modifications in our data structure, we
can get $O(n^{2.75})$ time with $O(n^{2.25})$ space or $O(n^{2.75})$ expected time with
$O(n^2)$ space. We also demonstrate empirically that this algorithm clearly
outperforms MNR, AHRSZ, and PK on a certain class of hard sequences of
edge insertions, while being competitive on random edge sequences leading
to complete DAGs.
Our algorithm is dynamic, as it also supports deletion. However, our analysis
holds only for a sequence of insertions. Our algorithm can also be used for
online cycle detection in graphs. Moreover, it permits an arbitrary
starting point, which makes a hybrid approach possible, i.e., using the PK
or KB algorithm for sparse graphs and ours for dense graphs.

The rest of this paper is organized as follows. In Section 2, we describe
the algorithm and the data structures involved. In Section 3, we give the
correctness argument for our algorithm, followed by an analysis of its runtime
in Sections 4 and 5. The details of our implementation and an empirical
comparison with other algorithms follow in Section 6.



2 Algorithm 

We keep the current topological order as a bijective function $T : V \to [1..n]$.
If we start with an empty graph, we can initialize T with an arbitrary
permutation; otherwise T is the topological order of the starting graph, computed
offline. In this and the subsequent sections, we will use the following
notations: $d(u, v)$ denotes $|T(u) - T(v)|$, $u < v$ is a short form of $T(u) < T(v)$,
$u \to v$ denotes an edge from u to v, and $u \leadsto v$ expresses that v is reachable
from u. Note that $u \leadsto u$, but not $u \to u$.

Figure 1 gives the pseudo code of our algorithm. Throughout the process of
inserting new edges, we maintain some data structures which are dependent
on the current topological order. Inserting a new edge (u, v) is done by
calling Insert(u, v). If $v > u$, we do not change anything in the current
topological order and simply insert the edge into the graph data structure.
Otherwise, we call Reorder to update the topological order as well as the
data structures dependent on it. As we will prove in Theorem 4, detecting
$v = u$ indicates a cycle. If $v < u$, we first collect sorted sets A and B as
defined in the code. If both A and B are empty, we swap the topological order
of the two nodes and update the data structures. The query and the update
operations are described in more detail along with our data structures in
Section 2.1. Otherwise, we recursively call Reorder until everything inside
is topologically ordered. To make these recursive calls efficient, we first merge
the sorted sets $\{v\} \cup A$ and $B \cup \{u\}$ and, using this merged list, compute the
set $\{u' : (u' \in B \cup \{u\}) \wedge (u' \ge v')\}$ for each node $v' \in \{v\} \cup A$.
Insert(u, v)
    ▷ Insert edge (u, v) and calculate new topological order
1   if v ≤ u then Reorder(u, v)
2   insert edge (u, v) in graph

Reorder(u, v)
    ▷ Reorder nodes between u and v such that v ≤ u
1   if u = v then report detected cycle and quit
2   A := {w : v → w and w ≤ u}
3   B := {w : w → u and v ≤ w}
4   if A = ∅ and B = ∅
    then ▷ Correct the topological order
5       swap u and v
6       update the data structure
    else ▷ Reorder node pairs between u and v
7       for v' ∈ {v} ∪ A in decreasing topological order
8           for u' ∈ B ∪ {u} ∧ u' ≥ v' in increasing topological order
9               Reorder(u', v')

Figure 1: Our algorithm
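The pseudo code of Figure 1 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation evaluated later: the class name `OnlineTopoOrder` is hypothetical, positions are 0-based, the sets A and B are found by scanning plain adjacency sets instead of buckets, comparisons are read as non-strict so that the case u' = v' reaches the cycle check, and the test u' ≥ v' is evaluated right before each recursive call rather than precomputed from a merged list.

```python
class OnlineTopoOrder:
    """Illustrative sketch of Insert/Reorder without the bucket data structure."""

    def __init__(self, n):
        self.pos = list(range(n))      # pos[x] plays the role of T(x), 0-based
        self.node_at = list(range(n))  # inverse mapping: node_at[pos[x]] == x
        self.out = [set() for _ in range(n)]
        self.inc = [set() for _ in range(n)]

    def insert(self, u, v):
        if self.pos[v] <= self.pos[u]:
            self._reorder(u, v)
        self.out[u].add(v)
        self.inc[v].add(u)

    def _reorder(self, u, v):
        if u == v:
            raise ValueError("cycle detected")
        pos = self.pos
        # A: successors of v not beyond u; B: predecessors of u not before v.
        A = sorted((w for w in self.out[v] if pos[w] <= pos[u]), key=pos.__getitem__)
        B = sorted((w for w in self.inc[u] if pos[v] <= pos[w]), key=pos.__getitem__)
        if not A and not B:
            pu, pv = pos[u], pos[v]
            pos[u], pos[v] = pv, pu
            self.node_at[pv], self.node_at[pu] = u, v
        else:
            for v2 in reversed([v] + A):          # decreasing topological order
                for u2 in B + [u]:                # increasing topological order
                    if pos[u2] >= pos[v2]:
                        self._reorder(u2, v2)
```

For example, starting from the identity order on three nodes, `insert(2, 1)` followed by `insert(1, 0)` yields the order 2, 1, 0, and a subsequent `insert(0, 2)` raises `ValueError` because it would close a cycle.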
2.1 Data structure

We store the current topological order as a set of two arrays, storing the
bijective mapping $T$ and its inverse. This ensures that finding $T(u)$ and
$T^{-1}(i)$ are constant-time operations.

The graph itself is stored as an array of vertices. For each vertex we maintain
two adjacency lists, which keep the incoming and outgoing edges separately.
Each adjacency list is stored as an array of buckets of vertices. Each bucket
contains at most t nodes for a fixed t. Depending on the concrete
implementation of the buckets, the parameter t is later chosen to be approximately
$n^{0.75}$ so as to balance the number of inserts and deletes from the buckets
and the extra edges touched by the algorithm. The i-th bucket ($i \ge 0$) of a
node u contains all adjacent nodes v with $i \cdot t \le d(u, v) < (i + 1) \cdot t$. The
nodes of a bucket are stored with node index (and not topological order) as
their key. The bucket can be kept as a balanced binary tree, as an array
of n bits, or as a hash table with a universal hash function. The bucket data
structure should provide efficient support for the following three operations:

1. Insert: Inserting an element in a given bucket. 

2. Delete: Given an element and a bucket, find out if that element exists 
in that bucket. If yes, delete the element from there and return 1. Else, 
return 0. 

3. Collect-all: Copying all the elements from the bucket to some vector. 

Depending on how we choose to implement the buckets, we get different
runtimes. This will be discussed in Section 5. We will now discuss how
we do the insertion of an edge, the computation of A and B, and the update
of the data structure under swapping of nodes in terms of the above three
basic operations.
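The bucket layout described above can be illustrated with a short sketch. The helper names `bucket_index` and `build_buckets` are hypothetical, positions are 0-based, and Python sets stand in for the actual bucket implementation.

```python
def bucket_index(pos, u, w, t):
    """Index i of the bucket of u holding w, i.e. i*t <= d(u, w) < (i+1)*t."""
    return abs(pos[u] - pos[w]) // t

def build_buckets(pos, u, neighbours, t):
    """Build one adjacency list of u from scratch as an array of buckets.

    Plain sets are used as buckets here only for illustration.
    """
    n = len(pos)
    # Distances range over 0..n-1, so (n-1)//t + 1 buckets suffice.
    buckets = [set() for _ in range((n - 1) // t + 1)]
    for w in neighbours:
        buckets[bucket_index(pos, u, w, t)].add(w)
    return buckets
```

With eight nodes in identity order, u = 0, neighbours 1, 3, 4, 7 and t = 3, the three buckets are {1}, {3, 4} and {7}.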

Inserting an edge (u, v) means inserting node v into the forward adjacency list
of u and u into the backward adjacency list of v. This requires $O(1)$ bucket
inserts.

For given u and v, the set $A := \{w : v \to w \text{ and } w \le u\}$, sorted according
to the current topological order, can be computed from the adjacency list of
v by sorting all nodes of the first $\lceil d(u, v)/t \rceil$ outgoing buckets and choosing
all w with $w \le u$. This can be done by $O(d(u, v)/t)$ collect-all operations on
buckets collecting a total of $O(|A| + t)$ elements. These elements are integers
in the range $\{1..n\}$ and can be sorted in $O(|A| + t + \sqrt{n})$ time using a two-pass
radix sort algorithm. The set B is computed likewise from the incoming
edges.
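The two-pass radix sort mentioned above can be sketched as follows: with a base of about $\sqrt{n}$, every key has two digits, so two stable distribution passes sort k keys in $O(k + \sqrt{n})$ time. The function name `radix_sort_two_pass` is an assumption for illustration, and keys are taken from 0..n.

```python
import math

def radix_sort_two_pass(keys, n):
    """Sort integers from the range 0..n using two stable bucket passes."""
    base = math.isqrt(n) + 1          # base * base > n, so two digits suffice
    # First pass on the low digit, second pass on the high digit.
    for digit in (lambda k: k % base, lambda k: k // base):
        bins = [[] for _ in range(base)]
        for k in keys:
            bins[digit(k)].append(k)
        keys = [k for b in bins for k in b]
    return keys
```

To sort nodes by topological position, one would sort their positions this way and map them back through the inverse mapping.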

When we swap two nodes u and v, we need to update the adjacency lists
of u and v as well as those of all nodes w that are adjacent to u and/or v.
First, we show how to update the adjacency lists of u and v. If $d(u, v) > t$,
we have to build their adjacency lists from scratch. Otherwise, the new
bucket boundaries will differ from the old boundaries by $d(u, v)$ and at most
$d(u, v)$ nodes will need to be transferred between any pair of consecutive
buckets. The total number of transfers is therefore bounded by $d(u, v) \lceil n/t \rceil$.
Determining whether a node should be transferred can be done in $O(1)$ using
the inverse mapping and, as noted above, a transfer can be done in $O(1)$
bucket inserts and deletes. Hence, updating the adjacency lists of u and v
needs $\min\{n,\ d(u, v) \lceil n/t \rceil\}$ bucket inserts and deletes.

Let w be a node which is adjacent to u or v. Its adjacency list needs to
be updated only if u and v are in different buckets. This corresponds to w
being in different buckets of the adjacency lists of u and v. Therefore, the
number of nodes to be transferred between different buckets for maintaining
the adjacency lists of all such nodes w is the same as the number of nodes that
need to be transferred for maintaining the adjacency lists of u and v, i.e.,
$\min\{n,\ d(u, v) \lceil n/t \rceil\}$.

Updating the mappings $T$ and $T^{-1}$ after such a swap is trivial and can be
done in constant time. Thus, we conclude that swapping nodes u and v can
be done by $O(d(u, v) \lceil n/t \rceil)$ bucket inserts and deletes.
3 Correctness



Theorem 1. The above algorithm returns a valid topological order after each 
edge insertion. 

Proof. For a graph with no edges, any ordering is a correct topological order,
and therefore, the theorem is trivially correct. Assuming that we have a valid
topological order of a graph G, we show that when inserting a new edge (u, v)
using Insert(u, v), our algorithm maintains the correct topological order of
$G' := G \cup \{(u, v)\}$. If $u < v$, this is trivial.

We need to prove that $x < y$ for all nodes x, y of G' with $x \leadsto y$. If there was
a path $x \leadsto y$ in G, Lemma 2 gives $x < y$. Otherwise (if there is no $x \leadsto y$ in
G), the path $x \leadsto y$ must have been introduced to G' by the new edge (u, v).
Hence $x < y$ in G' by Lemma 3, since there is $x \leadsto u \to v \leadsto y$ in G'. □

Lemma 2. Given a DAG G and a valid topological order. If $u \leadsto v$ and
$u < v$, then all subsequent calls to Reorder will maintain $u < v$.

Proof. Let us assume the contrary. Consider the first call of Reorder which
leads to $u > v$. Either this call led to swapping u and w with $v \le w$ or it
caused swapping w and v with $w \le u$. Note that in our algorithm, a call of
Reorder(u, v) leads to a swapping only if $A = \emptyset$ and $B = \emptyset$. Assuming that
it was the first case (swapping u and w) caused by the call to Reorder(w, u),
$A = \emptyset$. However, $x \in A$ for an x with $u \to x \leadsto v$, leading to a contradiction.
The other case is proved similarly. □

Lemma 3. Given a DAG G with $v \leadsto y$ and $x \leadsto u$, a call of Reorder(u, v)
will ensure that $x < y$.

Proof. The proof follows by induction on the recursion depth of Reorder(u, v).
For leaf nodes of the recursion tree, $A = B = \emptyset$. If $x < y$ before this call
happens, Lemma 2 ensures that $x < y$ will continue to hold. Otherwise, $y := v$ and
$x := u$. The swapping of u and v in line 5 gives $x < y$.

We assume this lemma to be true for calls of Reorder up to a certain tree
level. If $A \neq \emptyset$, then there is a $\tilde{v}$ such that $v \to \tilde{v} \leadsto y$, otherwise $\tilde{v} := v = y$.
If $B \neq \emptyset$, then there is a $\tilde{u}$ such that $x \leadsto \tilde{u} \to u$, otherwise $\tilde{u} := u = x$. Hence
$\tilde{v} \le y < x \le \tilde{u}$. The for-loops of lines 7 and 8 will call Reorder($\tilde{u}$, $\tilde{v}$).
By the inductive hypothesis, this will ensure $x < y$. According to Lemma 2,
further calls to Reorder will maintain $x < y$. □

Theorem 4. The algorithm detects a cycle if and only if there is a cycle in 
the given edge sequence. 

Proof. "⇒": First, we show that within a call to Insert(u, v), there are
paths $v \leadsto v'$ and $u' \leadsto u$ for each recursive call to Reorder(u', v'). This is
trivial for the first call to Reorder and follows immediately by the definition
of A and B for all subsequent recursive calls to Reorder. This implies that
if the algorithm indicates a cycle in line 1 of Reorder, there is indeed a
cycle $u \to v \leadsto v' = u' \leadsto u$. In fact, the cycle itself can be computed using
the recursion stack of the current call to Reorder.

"⇐": Consider the edge (u, v) of the cycle $v \leadsto u \to v$ inserted last. Since
$v \leadsto u$ before the insertion of this edge, the topological order computed will
have $v < u$ (Theorem 1) and therefore, Reorder(u, v) would be called. In
fact, all edges in the path $v \leadsto u$ will obey the current topological ordering
and by Lemma 2, it will remain so for all subsequent calls of Reorder. We
prove by induction on the number of nodes in the path $v \leadsto u$ (including
u and v) that whenever $v \leadsto u$ and Reorder(u, v) is called, it detects
the cycle. A call of Reorder(u', v') with $u' = v'$ or Reorder(u', v') with
$v' \to u'$ clearly reports a cycle. Consider a path $v \to x \leadsto y \to u$ of length
$k > 2$ and the call of Reorder(u, v). As noted before, $v < x \le y < u$
before the call to Reorder(u, v). Hence $x \in A$ and $y \in B$ and a call to
Reorder(y, x) will be made in the for-loops of lines 7 and 8. As the path
$x \leadsto y$ has $k - 2$ nodes, the call to Reorder(y, x) (by our inductive
hypothesis) will detect the cycle. □






4 Runtime 



Theorem 5. Online topological ordering can be computed using $O(n^{3.5}/t)$
bucket inserts and deletes, $O(n^3/t)$ bucket collect-all operations collecting
$O(n^2 t)$ elements, and $O(n^{2.5} + n^2 t)$ operations.

Proof. Lemma 7 shows that Reorder is called $O(n^2)$ times. Lemma 9 shows
that the calculation of the sets A and B over all calls of Reorder can be
done by $O(n^3/t)$ bucket collect-all operations touching $O(n^2 t)$ edges, and
$O(n^{2.5} + n^2 t)$ operations. In Lemma 12, we prove that all the updates can
be done by $O(n^{3.5}/t)$ bucket inserts and deletes.

As for lines 7 and 8, we first merge the two sorted sets A and B, which
takes $O(|A| + |B|)$ operations. For a particular node $v' \in \{v\} \cup A$, we can
compute the set $U' = \{u' : (u' \in B \cup \{u\}) \wedge (u' \ge v')\}$ (as required by line 8)
using this merged set in complexity $O(1 + |U'|)$, which is also the number
of calls of Reorder emanating for this particular node. Summing over the
entire for-loop of line 7, the total complexity of lines 7 and 8 is $O(|A| +
|B| + \#(\text{calls of Reorder emanating from here}))$. Since by Lemma 8 the
summation of $|A| + |B|$ over all calls of Reorder is $O(n^2)$ and by Lemma 7
the total number of calls to Reorder is also $O(n^2)$, we get a total of $O(n^2)$
operations for lines 7 and 8. Putting everything together, the theorem
follows. □
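To see how the bounds of Theorem 5 determine the bucket size, the following short calculation balances the two dominant terms. It assumes, purely for illustration, that every bucket insert and delete costs $O(1)$ and that a collect-all costs time linear in the number of elements collected, with $\sqrt{n} \le t \le n$.

```latex
\begin{align*}
T(n, t) &= O\!\left(\frac{n^{3.5}}{t}\right)
         + O\!\left(\frac{n^{3}}{t} + n^{2} t\right)
         + O\!\left(n^{2.5} + n^{2} t\right)
         = O\!\left(\frac{n^{3.5}}{t} + n^{2} t\right), \\
\frac{n^{3.5}}{t} = n^{2} t
        &\iff t^{2} = n^{1.5}
         \iff t = n^{0.75}, \\
T\!\left(n, n^{0.75}\right) &= O\!\left(n^{2.75}\right).
\end{align*}
```

If a bucket operation instead costs a logarithmic factor, the same balancing shifts t by a factor of $\sqrt{\log n}$.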

Lemma 6. Reorder is local, i.e., a call to Reorder(u, v) does not affect
the topological ordering of nodes w such that either $w < v$ or $w > u$ just
before the call was made.

Proof. This lemma can be proved by induction on the level of the recursion
tree of the call to Reorder(u, v). For a leaf node of the recursion tree,
$|A| = |B| = 0$ and the topological order of u and v is swapped, not affecting
the topological ordering of any other node.

We assume this lemma to be true up to a certain tree level. To see that it is
valid even for a level higher, note that the arrays A and B contain elements
w such that $v < w < u$. Since each call of Reorder in the for-loops of lines 7
and 8 is from an element of A to an element of B and all of these calls are
themselves local by our induction hypothesis, this call of Reorder is also
local. □

Lemma 7. Reorder is called $O(n^2)$ times.

Proof. Let u and v be arbitrary nodes. Let us consider the first time
Reorder(u, v) is called. If $A = B = \emptyset$, u and v will be swapped. Otherwise,
Reorder(u', v') is called recursively for all $v' \in \{v\} \cup A$ and $u' \in B \cup \{u\}$
with $u' \ge v'$. The order in which we make these recursive calls and the
fact that Reorder is local (Lemma 6) ensure that Reorder(u, v) is not
called except as the last of these recursive calls. In this second call to
Reorder(u, v), $A = B = \emptyset$. To see this, consider all $v' \in A$ and $u' \in B$
(A and B from the first call of Reorder(u, v)). Reorder(u, v') and
Reorder(u', v) must have been called within the for-loop of the first execution
of Reorder(u, v) before this second call was made. Therefore it follows
from Lemma 2 and Lemma 3 that before the second call, $u < v'$ and $u' < v$
for all $v' \in A$ and $u' \in B$. Hence u and v will be swapped at the latest in the
second call of Reorder(u, v). Since Reorder(u, v) is only called if $v < u$,
Reorder(u, v) will not be called again. Hence, Reorder(u, v) is called at
most two times for each node pair (u, v). □

Lemma 8. The summation of $|A| + |B|$ over all calls of Reorder is $O(n^2)$.

Proof. Consider arbitrary nodes u and v'. We prove that for all $v \in V$,
$v' \in A$ happens only once over all calls of Reorder(u, v). This proves that
$\sum |A| \le n$ for all such calls of Reorder(u, v). Therefore, summing up for
all $u \in V$, $\sum |A| \le n^2$ over all calls of Reorder.

In order to see that for all $v \in V$, $v' \in A$ happens only once over all
calls of Reorder(u, v), observe that $v' \in A$ implies that $v' \le u$ before
Reorder(u, v) was called. In particular, $v' < u$ before the call of
Reorder(u, v') in the for-loop of Reorder(u, v) (this follows from the order of
recursive calls) and by Lemma 3, $u < v'$ after this call. Therefore, $v' \notin A$
for a call of Reorder(u, w) for any node w after this call. The same is
true for all calls of Reorder(u, w) before this call, as otherwise $u < v'$ even
before the beginning of the current call of Reorder(u, v) and $v' \notin A$ for
the current call. Also, $v' \notin A$ for any of the recursive calls of this call to
Reorder(u, v'). This follows from the order in which we make the recursive
calls and the fact that Reorder is local (Lemma 6).

Analogously, it can be proved that for arbitrary nodes v and u' and for all
$u \in V$, $u' \in B$ happens only once over all calls of Reorder(u, v). The
proof for $\sum |B| \le n^2$ follows similarly and it completes the proof for this
lemma. □

Lemma 9. Calculating the sorted sets A and B over all calls of Reorder
can be done by $O(n^3/t)$ bucket collect-all operations touching a total of $O(n^2 t)$
elements and $O(n^{2.5} + n^2 t)$ operations for sorting these elements.

Proof. Consider the calculation of set A in a call of Reorder(u, v). As
discussed before in Section 2.1, we look at the out adjacency list of v, stored
in the form of buckets. In particular, we will need $O(d(u, v)/t)$ bucket
collect-all operations touching $O(|A| + t)$ elements to calculate A. The additional
worst-case term of t stems from the last bucket visited. Summing up over
all calls of Reorder, we get $O(\sum d(u, v)/t)$ collect-alls touching
$O(\sum (|A| + |B| + t))$ elements. Since $d(u, v) \le n$ for every call of Reorder(u, v) and
there are $O(n^2)$ calls of Reorder (Lemma 7), there are $O(n^3/t)$ bucket
collect-all operations. Also, since $\sum (|A| + |B|) = O(n^2)$ by Lemma 8, the
total number of elements touched is $O(n^2 + \sum t) = O(n^2 t)$. Since the keys
are in the range $\{1..n\}$, we can use a two-pass radix sort to sort the elements
collected from the buckets. The total sorting time over all calls of Reorder
is $\sum (2(|A| + t) + \sqrt{n}) + \sum (2(|B| + t) + \sqrt{n}) = O(n^{2.5} + n^2 t)$. □

Lemma 10. Each node-pair is swapped at most once. 



Proof. Reorder(u, v) is called only when $v < u$. Once a swapping happens,
$u < v$. By Lemma 2, it will remain so for all calls of Reorder thereafter.
Therefore, Reorder(u, v) is never called again and u and v will not be
swapped again. □

Lemma 11. $\sum d(u, v) = O(n^{5/2})$, where the summation is taken over all calls
of Reorder in which u and v are swapped.

Proof. Let $T^*$ denote the final topological ordering and

$$X(T^*(u), T^*(v)) := \begin{cases} d(u, v) & \text{if and when Reorder}(u, v) \text{ leads to a swapping} \\ 0 & \text{otherwise.} \end{cases}$$

Since by Lemma 10 any node pair is swapped at most once, the variable
$X(i, j)$ is clearly defined. Next, we model a few linear constraints on $X(i, j)$,
formulate them as a linear program, and use this LP to prove that
$\max\{\sum_{i,j} X(i, j)\} = O(n^{5/2})$. By definition of $d(u, v)$ and $X(i, j)$,

$$0 \le X(i, j) \le n \quad \text{for all } i, j \in [1..n].$$

For $j \le i$, the corresponding edges $(T^{*-1}(i), T^{*-1}(j))$ go backwards and
thus are never inserted at all. Consequently,

$$X(i, j) = 0 \quad \text{for all } j \le i.$$

Now consider an arbitrary node u, which is finally at position i, i.e., $T^*(u) = i$.
Over the insertion of all edges, this node has been moved left and right via
swapping with several other nodes. Strictly speaking, it has been swapped
right with nodes at final positions $j > i$ and has been swapped left with
nodes at final positions $j < i$. Hence, the overall movement to the right is
$\sum_{j > i} X(i, j)$ and to the left is $\sum_{j < i} X(j, i)$. Since the net movement (difference
between the final and the initial position) must be less than n,

$$\sum_{j > i} X(i, j) - \sum_{j < i} X(j, i) \le n \quad \text{for all } 1 \le i \le n.$$

Putting all the constraints together, we aim to solve the following linear
program.

$$\max \sum_{1 \le i \le n} \sum_{1 \le j \le n} X(i, j) \quad \text{such that}$$

(i) $X(i, j) = 0$ for all $1 \le i \le n$ and $1 \le j \le i$

(ii) $0 \le X(i, j) \le n$ for all $1 \le i \le n$ and $i < j \le n$

(iii) $\sum_{j > i} X(i, j) - \sum_{j < i} X(j, i) \le n$ for all $1 \le i \le n$

Note that these are necessary constraints, but not sufficient. But this is
enough for our purpose, as an upper bound to the solution of this LP will
give an upper bound for the $\sum X(i, j)$ in our algorithm. In order to prove
the upper bound on the solution to this LP, we consider the dual problem

$$\min\ n \Bigl[ \sum_{0 \le i < n} \sum_{i < j < n} Y_{i \cdot n + j} + \sum_{0 \le i < n} Y_{n^2 + i} \Bigr] \quad \text{such that}$$

(i) $Y_{i \cdot n + j} \ge 1$ for all $0 \le i < n$ and for all $j \le i$

(ii) $Y_{i \cdot n + j} + Y_{n^2 + i} - Y_{n^2 + j} \ge 1$ for all $0 \le i < n$ and for all $j > i$

(iii) $Y_i \ge 0$ for all $0 \le i < n^2 + n$

and the following feasible solution for the dual:

$Y_{i \cdot n + j} = 1$ for all $0 \le i < n$ and for all $0 \le j \le i$

$Y_{i \cdot n + j} = 1$ for all $0 \le i < n$ and for all $i < j \le i + 1 + 2\sqrt{n}$

$Y_{i \cdot n + j} = 0$ for all $0 \le i < n$ and for all $j > i + 1 + 2\sqrt{n}$

$Y_{n^2 + i} = \sqrt{n - i}$ for all $0 \le i < n$.

This solution has a value of $n^2 + 2n^{5/2} + n \sum_{i=1}^{n} \sqrt{i} = O(n^{5/2})$, which by the
primal-dual theorem is a bound on the solution of the original LP.

In fact, it can be shown that there is a solution to the primal LP whose value is
$\Omega(n^{5/2})$, namely

$X(i, j) = 0$ for all $0 \le i < n$ and for all $0 \le j \le i$

$X(i, j) = n$ for all $0 \le i < n$ and for all $i < j \le i + \lceil (\sqrt{1 + 8i} - 1)/2 \rceil$

$X(i, j) = 0$ for all $0 \le i < n$ and for all $j > i + \lceil (\sqrt{1 + 8i} - 1)/2 \rceil$.

□
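The feasibility of the dual solution used in the proof of Lemma 11 can be checked numerically for small n. The sketch below is only a sanity check under assumptions: the function name `dual_solution_value` is hypothetical, indices are 0-based, and the dual objective is taken to weight the variables $Y_{i \cdot n + j}$ with $j > i$ and the variables $Y_{n^2 + i}$ by n.

```python
import math

def dual_solution_value(n):
    """Check dual constraints (i)-(iii) and return the objective value."""
    threshold = 1 + 2 * math.sqrt(n)
    # a[i][j] stands for Y_{i*n+j}, b[i] for Y_{n^2+i}.
    a = [[1.0 if j <= i + threshold else 0.0 for j in range(n)] for i in range(n)]
    b = [math.sqrt(n - i) for i in range(n)]
    eps = 1e-9  # tolerance for floating-point comparisons
    for i in range(n):
        assert b[i] >= 0                                  # (iii)
        for j in range(n):
            assert a[i][j] >= 0                           # (iii)
            if j <= i:
                assert a[i][j] >= 1 - eps                 # (i)
            else:
                assert a[i][j] + b[i] - b[j] >= 1 - eps   # (ii)
    upper = sum(a[i][j] for i in range(n) for j in range(i + 1, n))
    return n * (upper + sum(b))
```

For each n tried, the returned value stays below $n^2 + 2n^{5/2} + n \sum_{i=1}^{n} \sqrt{i}$, in line with the bound stated in the proof.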

Lemma 12. Updating the data structure over all calls of Reorder requires
$O(n^{3.5}/t)$ bucket inserts and deletes.

Proof. Our data structure requires $O(d(u, v)\, n/t)$ bucket inserts and deletes
to swap two nodes u and v. Each node pair is swapped at most once (cf.
Lemma 10). Hence, summing up over all calls of Reorder(u, v) where u
and v are swapped, we need $\sum O(d(u, v)\, n/t) = O(n^{3.5}/t)$ bucket inserts and
deletes using Lemma 11. □
5 Bucket data structure



We get different runtimes and space requirements of our algorithm depending
on the data structures of the buckets used:

(a) Balanced binary trees: Balanced binary trees give us $O(1 + \log r)$ time
insert and delete and an $O(1 + r)$ time collect-all operation, where r is the
number of elements in the bucket. Therefore, by Theorem 5, the total
time required will be $O(n^2 t + n^{3.5} \log n / t)$. Substituting $t = n^{0.75} \sqrt{\log n}$,
we get a total time of $O(n^{2.75} \sqrt{\log n})$. The total space requirement will
be $O(n^2)$, as a balanced binary tree needs $O(t)$ nodes for storing at most
t elements.

(b) n-bit array: A bucket that stores at most t elements can be kept as
an n-bit array, where each bit is 0 or 1 depending on whether or not
the element is present in the bucket. Also, we can keep a list of all
elements in the bucket. To insert, we just flip the appropriate bit and
insert at the end of the list. To delete, we just flip the appropriate bit.
To collect all, we go through the list and for each element in the list,
we check if the corresponding bit is 1 or 0. If it is 0, we also remove
it from the list. This gives us constant-time insert and delete, and the
time for the collect-all operation will be the total output size plus the total
number of deletes. Each delete is counted once in collect-all, as we remove
the corresponding element from the list after the first collect-all. By
Theorem 5, the total time required will be $O(n^2 t + n^{3.5}/t)$, giving us
$O(n^{2.75})$ for $t = n^{0.75}$. The total space requirement will be $O(n)$ for
each bucket, leading to a total of $O(n^{2.25})$ for $O(n^2/t)$ buckets.

(c) Uniform hashing: A data structure based on uniform hashing coupled
with a list of elements in the bucket, operated in the same way as
the n-bit array, will give an expected constant-time insert and delete
and the same bound for collect-all as for the n-bit array. This gives an
expected total time of $O(n^2 t + n^{3.5}/t)$. With $t = n^{0.75}$, this yields an
expected time of $O(n^{2.75})$. Since the hashing-based data structure as
described in [11] takes only linear space, the total space requirement is
$O(n^2)$.
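Variant (b) can be sketched as follows. This is an illustrative reading of the description above, not the implementation used in the experiments: the class name `BitArrayBucket` is hypothetical, a `bytearray` with one byte per node stands in for the n-bit array, and the duplicate filtering in `collect_all` is an added assumption to keep the output clean when an element is deleted and inserted again before the next collect-all.

```python
class BitArrayBucket:
    """Bucket with constant-time insert/delete and lazily cleaned element list."""

    def __init__(self, n):
        self.present = bytearray(n)  # present[x] == 1 iff x is in the bucket
        self.items = []              # may contain stale or repeated entries

    def insert(self, x):
        self.present[x] = 1
        self.items.append(x)

    def delete(self, x):
        """Remove x if present; return 1 on success and 0 otherwise."""
        if self.present[x]:
            self.present[x] = 0
            return 1
        return 0

    def collect_all(self):
        """Return all current elements and drop stale list entries."""
        kept = []
        for x in self.items:
            if self.present[x] == 1:
                self.present[x] = 2   # temporary mark: already emitted
                kept.append(x)
        for x in kept:
            self.present[x] = 1       # restore the plain membership flag
        self.items = kept
        return list(kept)
```

Stale entries are paid for once, in the first collect-all after their deletion, which matches the accounting given in (b).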






6 Empirical Comparison 



6.1 Configuration 

We conducted our experiments on a 2.4 GHz Opteron machine with 8 GB of
main memory running Debian GNU/Linux. For PK, MNR, and AHRSZ we
used the C++/Boost based implementation of David J. Pearce [15]. For our
algorithm (AFM), we implemented variant (b) of Section 5 using C++/STL.
All codes were compiled using gcc 3.3 in 32-bit mode and optimization level
"-O3". The timings were measured using the gettimeofday function of
<sys/time.h> and all the results are averaged over 10 runs each.



6.2 DAG classes considered 

We first consider random edge insertion sequences leading to a complete
DAG. Stopping such a sequence after m edges, before the DAG is complete,
results in a random DAG, similar to the G(n, m) random graph model of
Erdős. On a random edge sequence, all the
algorithms are quite fast and none of them encounters its worst-case behavior.
Therefore, we consider a particular sequence of edges which we believe is a 
hard instance of the problem. This edge sequence is similar to the worst-case 
sequence given by Katriel et al. for their algorithm. On this sequence, PK, 
MNR and AHRSZ (the variant choosing the smallest permitted priority) face 
their worst-case of VLip?) operations, while our algorithm takes f2(n^'^) time 
complexity. This sequence of edges is depicted in Fig. 2.

For an example with n nodes, we divide the set of nodes into four blocks of
different sizes: block 1 consists of nodes $[0..n/3)$, block 2 of nodes
$[n/3..n/2)$, block 3 of nodes $[n/2..2n/3)$, and block 4 of nodes $[2n/3..n)$.
First, we insert $n - 4$ edges such that within each block, the vertices form
a directed path from left to right. Then we insert the following edges.

Figure 2: Our hard-case graph
(a) $\forall j \in [0..n/3)\ \forall k \in [0..n/6)$: add edge($j$, $k + n/2$),

(b) $\forall j \in [0..n/6)$: add edge($2j$, $j + n/3$) and edge($2j + 1$, $j + n/3$),

(c) $\forall j \in [0..n/6)\ \forall k \in [0..n/3)$: add edge($j + n/3$, $k + 2n/3$),

(d) $\forall j \in [0..n/6)\ \forall k \in [0..n/6)$: add edge($j + n/2$, $k + n/3$),

where $\overrightarrow{\forall}$ denotes going from left to right in the for-loop and $\overleftarrow{\forall}$ the other way
around. Similar sequences, which force AHRSZ to encounter its asymptotic
worst-case complexity, can be chosen for all variants of AHRSZ.
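The construction above can be turned into a small generator. The sketch below is an assumption-laden illustration: the function name `hard_instance` is hypothetical, n is assumed to be divisible by 6, and every loop runs in ascending order, which may differ from the loop directions prescribed for the actual hard sequence.

```python
def hard_instance(n):
    """Return the edge insertion sequence for n nodes (n divisible by 6)."""
    assert n % 6 == 0
    third, half, sixth = n // 3, n // 2, n // 6
    blocks = [(0, third), (third, half), (half, 2 * third), (2 * third, n)]

    edges = []
    # n - 4 path edges: each block becomes a directed path from left to right.
    for lo, hi in blocks:
        edges += [(x, x + 1) for x in range(lo, hi - 1)]
    # (a) block 1 to the first n/6 nodes of block 3
    edges += [(j, k + half) for j in range(third) for k in range(sixth)]
    # (b) pairs of block-1 nodes to block 2
    for j in range(sixth):
        edges += [(2 * j, j + third), (2 * j + 1, j + third)]
    # (c) block 2 to block 4
    edges += [(j + third, k + 2 * third) for j in range(sixth) for k in range(third)]
    # (d) block 3 back to block 2
    edges += [(j + half, k + third) for j in range(sixth) for k in range(sixth)]
    return edges
```

For n = 12 this yields 8 path edges followed by 8, 4, 8 and 4 edges for steps (a) to (d).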

6.3 Results 

Fig. 3 shows the runtimes of the four algorithms in consideration for random
edge sequences leading to complete DAGs with varying number of vertices n
(and with $m = \binom{n}{2}$). We see that AFM is a constant factor of 2 to 4 away from
AHRSZ, MNR and PK.

Fig. 4 shows the average runtimes for random graphs with n = 1000 and
a varying number of edges. AFM loses a lot in the first $O(n \log n)$ edges
because in this phase, updating the data structures after every swap
proves very costly. But after that, the curves of AFM and PK/MNR
are almost parallel, while the slope for AHRSZ is around 2 times that of
AFM. For practical purposes, we therefore believe that a hybrid approach
would perform best. That is, one inserts the first $O(n \log n)$ edges with either
PK or KB and then inserts the remaining edges with our algorithm.

Fig. 5 shows the runtimes of the four algorithms in consideration on the
class of hard edge sequences described before. The difference in asymptotic
behaviour as discussed before is clear from the graph. For n = 8000, AFM
is 2 times faster than MNR, 3.6 times faster than PK, and 30 times faster
than AHRSZ.
Figure 3: Experimental data on full random graphs with varying n (x-axis: number of vertices n)

Figure 4: Experimental data on random graphs with n = 1000 and varying m (x-axis: number of edges m)

Figure 5: Experimental data on a class of hard instances with varying n (x-axis: number of vertices n)
7 Discussion



We have presented the first $o(n^3)$ algorithm for online topological ordering.
We also implemented this new algorithm and compared it with previous
approaches, showing that for certain hard examples, it outperforms PK, MNR,
and AHRSZ, while it is still competitive on random edge sequences leading to
complete DAGs. The only non-trivial lower bound for this problem is by
Ramalingam and Reps, who show that an adversary can force any algorithm
maintaining explicit labels to need $\Omega(n \log n)$ time complexity for inserting
$n - 1$ edges. There is still a large gap between this, the trivial lower bound
of $\Omega(m)$, and the upper bound of $O(\min\{m^{3/2} + n^2 \log n,\ m^{3/2} \log n,\ n^{2.75}\})$.
Bridging this gap remains an open problem.